In the context of emotion information processing, several studies have demonstrated the 
involvement of the amygdala in emotion perception, for unimodal and multimodal stimuli. 
However, it seems that not only the amygdala, but several regions around it, may also play 
a major role in multimodal emotional integration. In order to investigate the contribution 
of these regions to multimodal emotion perception, five patients who had undergone 
unilateral anterior temporal lobe resection were exposed to both unimodal (vocal or visual) 
and audiovisual emotional and neutral stimuli. In a classic paradigm, participants were 
asked to rate the emotional intensity of angry, fearful, joyful, and neutral stimuli on visual 
analog scales. Compared with matched controls, patients exhibited impaired categorization 
of joyful expressions, whether the stimuli were auditory, visual, or audiovisual. Patients 
confused joyful faces with neutral faces, and joyful prosody with surprise. In the case of 
fear, unlike matched controls, patients provided lower intensity ratings for visual stimuli than 
for vocal and audiovisual ones. Fearful faces were frequently confused with surprised ones. 
When we controlled for lesion size, we no longer observed any overall difference between 
patients and controls in their ratings of emotional intensity on the target scales. Lesion 
size had the greatest effect on intensity perceptions and accuracy in the visual modality, 
irrespective of the type of emotion. These new findings suggest that a damaged amygdala, 
or a disrupted bundle between the amygdala and the ventral part of the occipital lobe, has a 
greater impact on emotion perception in the visual modality than it does in either the vocal 
or audiovisual one. We can surmise that patients are able to use the auditory information 
contained in multimodal stimuli to compensate for difficulty processing visually conveyed 
emotion. 
INTRODUCTION 

The ability to decode emotional information is crucial in everyday life, allowing us to
adapt our behaviors when confronted with salient information, both for survival and for
social adaptiveness purposes. The emotional features of objects in the environment have
been shown to bring about an increase in the neuronal response, compared with the
processing of non-emotional information (for a review, see Phan et al., 2002). The role
that different brain regions play in decoding emotional information appears to depend on
the modality. Furthermore, research has shown that both the primary and secondary sensory
regions are modulated by emotion. For example, visual extrastriate regions are modulated
by emotions conveyed by facial expressions (e.g., Morris et al., 1998; Pourtois et al.,
2005a; Vuilleumier and Pourtois, 2007), while temporal voice-sensitive areas have been
shown to be modulated by emotional prosody (e.g., Mitchell et al., 2003; Grandjean et al.,
2005; Schirmer and Kotz, 2006; Wildgruber et al., 2006; Frühholz et al., 2012).



According to Haxby's face perception model (Haxby et al., 2000), visual information is
processed along a ventral pathway leading from the primary visual cortex (V1) to the
fusiform face area (FFA) and inferior temporal cortex (ITC). Face perception is sufficient
to activate the FFA (see, for example, Pourtois et al., 2005a; Kanwisher and Yovel, 2006;
Pourtois et al., 2010), but the activity of this structure is enhanced when the facial
information is emotional (see, for example, Breiter et al., 1996; Dolan et al., 2001;
Vuilleumier et al., 2001; Williams et al., 2004; Vuilleumier and Pourtois, 2007). Another
structure whose activity increases when decoding emotional facial information is the
amygdala (see, for example, Haxby et al., 2000; Calder and Young, 2005; Phelps and LeDoux,
2005; Adolphs, 2008). In monkeys, this structure has been shown to project to almost every
step along the visual ventral pathway (Amaral et al., 2003). Human studies, meanwhile, have
suggested that connectivity between the amygdala and the FFA is modulated by emotion
perception (Morris et al., 1998; Dolan et al., 2001; Vuilleumier et al., 2004; Sabatinelli
et al., 2005; Vuilleumier, 2005; Vuilleumier and Pourtois, 2007).

Regarding the amygdala's role in emotion perception, the current hypothesis is that this
structure detects salience, a general feature of emotion (for a discussion, see Sander
et al., 2003; Armony, 2013; Pourtois et al., 2013), through reciprocal connections with the
cortex (Amaral et al., 2003). Its main function is to facilitate attention and perception
processing (e.g., Armony and LeDoux, 1997, 1999; Whalen, 1998; Vuilleumier et al., 2001)
without explicit voluntary attention (for a review, see Vuilleumier and Pourtois, 2007).
According to LeDoux's (2007) model, the amygdala's output is directed both to regions that
modulate bodily responses (via the endocrine system related to the autonomic system), and
to the primary and associative cortices. These encompass regions modulated by emotion,
such as the extrastriate visual system, the FFA for face perception, and the voice area in
the superior temporal gyrus (STG; including the primary auditory region).

Further insight into emotional face perception and its subcortical bases has been provided
by studies of patients with lesions of the amygdala. More specifically, studies have
assessed patients with temporal lobe epilepsy whose lesions are linked either to the
epileptogenic disease itself or else to its surgical treatment (see, for example,
Cristinzio et al., 2007). These studies included patients with congenital or acquired
diseases resulting in bilateral lesions, and patients with unilateral epilepsy arising from
mesial temporal sclerosis who had undergone lobectomy with amygdalectomy. Patients with
bilateral damage have been found to display impaired fearful face perception (Adolphs
et al., 1994, 1995; Young et al., 1995; Calder et al., 1996; Broks et al., 1998) and
deficits in the perception of surprise and anger (Adolphs et al., 1994). Unilateral lesions
have yielded either no differences (Adolphs et al., 1995; prior surgery, Batut et al., 2006)
or else a deficit for patients with right-sided lesions covering either a range of emotions
(Anderson et al., 2000; Adolphs and Tranel, 2004) or solely fearful faces (prior surgery,
Meletti et al., 2003). Palermo et al. (2010) found that both left- and right-lesion groups
exhibited a deficit in fear intensity perception, but the left-lesion group was more
impaired for fear detection. Anterior temporal lobectomy with amygdalectomy is generally
expected to affect the perceived intensity of facial emotional expressions. The functional
explanation for this is a lack of modulation by the amygdala of the ventral visual
processing network and, more specifically in the case of emotional faces, of the FFA.

In addition to visual emotional information, the amygdala has been shown to be associated
with different responses to emotional vocalizations. According to Schirmer et al. (2012),
the processing of auditory information takes place along three streams in the temporal
lobe: a posterior stream passing through the posterior part of the superior temporal
sulcus (pSTS) for sound embodiment; a ventral stream directed toward the middle temporal
gyrus (MTG) for concept processing; and an anterior stream extending as far as the temporal
pole (TmP) for the perceptual domain (i.e., semantic processing). Another characteristic of
emotional vocalization perception is the hemispheric specificity modeled by Schirmer and
Kotz (2006). In their model, the left temporal lobe has a higher temporal resolution for
processing information than the right hemisphere, and is more involved in linguistic
signal processing (segmental information), with suprasegmental analysis taking place in
the right hemisphere. The amygdala has been shown to be modulated by emotional
vocalizations, including onomatopoeia (e.g., Morris et al., 1999; Fecteau et al., 2007;
Plichta et al., 2011), and emotional prosody consisting either of pseudowords (e.g.,
Grandjean et al., 2005; Sander et al., 2005; Frühholz and Grandjean, 2012, 2013), or of
words and sentences (e.g., Ethofer et al., 2006, 2009; Wiethoff et al., 2009).

In contrast to research on emotional face perception, studies of auditory emotion
processing in patients with bilateral amygdala lesions have produced divergent results.
Some have failed to find any effect at all on emotion recognition (semantically neutral
sentences: Adolphs and Tranel, 1999; names and onomatopoeia: Anderson and Phelps, 1998).
Others have reported either a general impairment (counting sequences: Brierley et al.,
2004) or specific impairments for fear (semantically neutral sentences: Scott et al., 1997;
non-verbal vocalizations: Dellacherie et al., 2011), surprise (Dellacherie et al., 2011),
anger (Scott et al., 1997), or sadness perception (musical excerpt: Gosselin et al., 2007).
There is a similar divergence for unilateral lesions, with either no effects (Adolphs and
Tranel, 1999; Adolphs et al., 2001) or a specific impairment for fear (counting sequences:
Brierley et al., 2004; meaningless words: Sprengelmeyer et al., 2010; non-verbal
vocalizations: Dellacherie et al., 2011). To sum up current knowledge about auditory
emotion processing, there is a strong hypothesis about right hemispheric involvement for
emotional prosody. The amygdala appears to be involved in prosody perception, but may also
be sensitive to the proximal context of the stimulus presentation (for a discussion, see
Frühholz and Grandjean, 2013).

In the case of face-voice emotion integration, studies featuring audiovisual emotional
stimuli have replicated the response facilitation effect at the behavioral level, namely an
increase in perceptual sensitivity and reduced reaction times (e.g., Massaro and Egan,
1996; De Gelder and Vroomen, 2000; Dolan et al., 2001; Kreifelts et al., 2007), that has
already been demonstrated in non-emotional studies (e.g., Miller, 1982; Schröger and
Widmann, 1998). Responsibility for the behavioral improvement has been mainly attributed to
various cortical substrates, including the left MTG (e.g., Pourtois et al., 2005b), the
posterior STG (pSTG; e.g., Ethofer et al., 2006; Kreifelts et al., 2007), and,
interestingly, the amygdala, either bilaterally (e.g., Klasen et al., 2011) or on the left
side (e.g., Dolan et al., 2001; Ethofer et al., 2006; Müller et al., 2012). Animal studies
have yielded a more detailed multimodal model, with different levels of integration. For
instance, a rhinal cortex lesion, as opposed to a direct lesion of the amygdala, is
sufficient to disrupt associative mechanisms (Goulet and Murray, 2001). Meanwhile, a
comparison of the roles of the perirhinal cortex (PRC) and the pSTS led Taylor et al. (2006)
to suggest that the pSTS plays a presemantic integration role, while the PRC integrates
higher level conceptual representations.

In summary, studies of the amygdala's modal specificity have reported impairments in
patients with temporal lobectomy or specific amygdalectomy for faces and either voices
(Scott et al., 1997; Sprengelmeyer et al., 1999; Brierley et al., 2004) or emotion in music
(Gosselin et al., 2007, 2011). However, some patients seem to have a specific deficit for
visual emotional stimuli (Adolphs et al., 1994, 2001; Anderson and Phelps, 1998; Adolphs
and Tranel, 1999). Discrepancies between studies have been explained by a number of
different factors, including the date of epilepsy onset (e.g., McClelland et al., 2006) and
the nature and context of the stimuli (e.g., face presentation duration; Graham et al.,
2007; Palermo et al., 2010). The fear specificity of amygdala processing has also been
strongly called into question (for a discussion, see Cahill et al., 1999; Murray, 2007;
Morrison and Salzman, 2010). To the best of our knowledge, however, the role of lesion size
has not been taken into account thus far.

Our aim in the present study was to test whether the categorization and intensity
perception of unimodal (i.e., either visual or non-verbal auditory) emotional stimuli, as
opposed to bimodal (i.e., audiovisual) emotional stimuli, is modified in patients who have
undergone unilateral anterior temporal lobectomy with amygdalectomy. The impact of anterior
temporal lobe ablation is assumed to differ with modality. Regarding the auditory network,
above and beyond the absence of voice area modulations owing to amygdala resection,
Schirmer et al. (2012) suggest that the anterior temporal lobe is more involved in semantic
processing, representing the final temporal step before the processing shifts to the
frontal regions associated with emotion evaluation. We would therefore expect disruption of
this input to have an impact on categorization, with patients making more mistakes or
confusing more items than matched controls. For the visual modality, we would expect to
find the same kind of deficit, stemming from the lack of emotion-related modulation of
visual cortical input. Finally, for audiovisual material, we would expect to observe either
a better preserved ability for correct detection and perceived intensity, if an intact pSTS
and a more dorsal pathway toward the frontal lobe are sufficient to integrate audiovisual
information, or no improvement because of the PRC lesion.

Participants rated the intensity of brief onomatopoeic vocalizations produced by actors
(Bänziger et al., 2012) and animated synthetic faces (Roesch et al., 2011) on visual analog
scales. At the group level, we expected the patients to have a higher error rate than
controls when it came to identifying unimodal emotional stimuli. This has been shown to be
the case in the visual modality for fearful faces (bilateral lesion: Adolphs et al., 1994,
1995; Young et al., 1995; Calder et al., 1996; Broks et al., 1998; unilateral lesion:
Anderson et al., 2000; McClelland et al., 2006), and in the auditory modality for both
fearful voices (bilateral lesion: Scott et al., 1997; Adolphs and Tranel, 2004; unilateral
lesion: Scott et al., 1997; Brierley et al., 2004; Sprengelmeyer et al., 2010; Dellacherie
et al., 2011) and angry voices (bilateral lesion: Scott et al., 1997). For the audiovisual
stimuli, we expected to observe a higher error rate for fear identification, arising from
the combined effects of the unimodal deficits in each modality. Regarding intensity
perception, we expected to observe similar patterns, even after controlling for the extent
of the lesion along the ventral pathway. Finally, we investigated the effects of lesion
size on emotion recognition. We predicted that perception of emotion intensity would be
modulated by the size of the lesion, with more extensive lesions resulting in impairment at
different levels of information processing. We developed an additional hypothesis to
explain the discrepant findings of previous studies.

MATERIALS AND METHODS 
PARTICIPANTS 

We recruited five patients who had undergone unilateral anteromedial temporal lobectomy
together with the unilateral removal of the amygdala. One patient (JP) had a lesion that
extended to the occipital and posterior parietal lobes. The surgery had been performed to
control the patients' medically intractable seizures (see Figure 1 for the location and
extent of their lesions): four on the left side (FB, 23 years old; CG, 37 years old; JP,
45 years old; and RS, 62 years old) and one on the right (CM, 31 years old). CG was the
only woman in the patient group, and FB the only left-handed patient. Controls were
recruited via local advertisements: 12 were matched with each of FB, CM, and CG for sex,
handedness, and age; six with JP; and three with RS (see Table 1 for a summary and Table 2
for a detailed description of each patient). Patients did not exhibit any gnosis deficit in
their respective neuropsychological tests. The study was approved by the local ethics
committee, and all the participants gave their written informed consent. The controls
received financial compensation (CHF 15) for taking part in the experiment.

LESION DELIMITATION AND DESCRIPTION 

In order to compute the lesion size of each patient, anatomical images were segmented and
normalized using a unified segmentation approach (Ashburner and Friston, 2005) together
with the Clinical toolbox¹. For the purpose of cost function masking (Andersen et al.,
2010), lesion masks drawn on the patients' anatomical scans were included in the brain
segmentation. Structural images and lesion masks were normalized to MNI space with the
DARTEL toolbox, using individual flow fields, which were estimated on the basis of the
segmented gray matter (GM) and white matter (WM) tissue classes. The normalized lesion
masks were used to calculate the lesion size for each patient in standard space.
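
As an illustration of the final step, lesion size in standard space amounts to counting the
non-zero voxels of the normalized binary mask. The following is a minimal sketch only: the
function name `lesion_size`, the nested-list mask layout, and the voxel volume are
assumptions made for the example, and the toy mask is not patient data.

```python
def lesion_size(mask, voxel_volume_mm3=1.0):
    """Return (voxel count, volume in mm^3) for a binary 3D lesion mask.

    `mask` is a nested list indexed [x][y][z]; any non-zero entry is counted
    as lesioned tissue. The voxel volume is an assumed parameter.
    """
    n_voxels = sum(1 for plane in mask for row in plane for value in row if value)
    return n_voxels, n_voxels * voxel_volume_mm3


# Toy 2 x 2 x 3 mask (hypothetical values, not patient data).
toy_mask = [
    [[0, 1, 1], [0, 0, 1]],
    [[0, 0, 0], [1, 1, 0]],
]

# Assumed isotropic 1.5 mm voxels for the sake of the example.
count, volume = lesion_size(toy_mask, voxel_volume_mm3=1.5 ** 3)
print(count, volume)
```

In practice the mask would be read from the normalized image file rather than typed in by
hand; only the counting logic is shown here.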

CG had a left anterior temporal lesion with an intact inferior temporal gyrus (ITG) and
lateral occipitotemporal gyrus (LOTG). The lesion area included the periamygdaloid cortex
(PAM), entorhinal cortex (Ent), medial occipitotemporal gyrus (MOTG), inferior part of the
hippocampus (Hi), parahippocampal gyrus (PHG), and amygdala, and ended in the lateral
anterior portion of the temporal lobe, in the MTG and TmP.

CM had a right anterior temporal lesion extending to the middle and ventromedial part of
the temporal lobe, including the inferior temporal pole (ITmP), ITG, Ent, PAM, PRC,
amygdala, inferior Hi, STG, anterior fusiform gyrus (FuG), and rhinal sulcus. In the
posterior part of the lesion, the planum polare (PPo), STG and STS were intact.

FB had a left anterior temporal lesion that included the TmP, MTG, MOTG, Ent, Hi, PAM,
amygdala, anterior STG, and posterior temporal cortex (PTe). The lesion ended in two
separate tails: one in the lateral anterior part of the temporal lobe, the other in the
medial part.



¹ http://www.mccauslandcenter.sc.edu/CRNL/clinical-toolbox 
FIGURE 1 | (A) Anatomical images of the lesions for each patient: each lesion was delineated manually on the axial plane and corrected using the coronal plane. 
(B) Probability map for the normalized lesion size. 



Table 1 | Participants.

| Patient | Lateralization (handedness) | Age (years) | Sex | Lesion (vx): side (volume, normalized volume) | Controls n | Mean age (SD) of controls |
|---|---|---|---|---|---|---|
| CG | Right-handed | 37 | Female | Left (31'564, 40'683) | 12 | 33.42 (3.63) |
| CM | Right-handed | 31 | Male | Right (36'836, 48'833) | 12 | 32.83 (2.48) |
| FB | Left-handed | 23 | Male | Left (21'596, 30'208) | 12 | 22.08 (1.88) |
| JP | Right-handed | 45 | Male | Left (434'284, 113'254) | 6 | 45.17 (2.32) |
| RS | Right-handed | 62 | Male | Left (44'011, 53'920) | 3 | 60 (4) |



JP had an extended left lateral resection including the temporal, frontal, parietal, and
occipital lobes. The temporal part included the TmP, MTG, Ent, MOTG, ITG, PTe, anterior
STG, and PHG. The frontal part included the lateral inferior and superior frontal gyri,
precentral gyrus and postcentral gyrus. Finally, part of the lateral superior posterior
occipital gyrus had been removed, but the FFA was intact.

RS had a left anterior temporal lesion encompassing the TmP, STG, MTG, MOTG, ITG, FuG,
amygdala, anterior Hi, Ent, PAM, FuG and anterior PHG. It ended in the lateral anterior
part. See Figure 1 for visual descriptions of the patients' brain damage.

Table 2 | Patient description.

| Patient | Years since lobectomy | Age at epilepsy onset | Comorbidity | Diagnosis |
|---|---|---|---|---|
| CG | 5 | 12 years | | Left temporal partial complex epilepsy with hippocampal sclerosis |
| CM | 6 | 11 years | Anxiety | Left hemiplegia with right hemisphere hypoplasia |
| FB | 10 | | | Major left hippocampal sclerosis |
| JP | 11 | 5 years | Depressed feeling (at time of surgery) | Left temporal epilepsy |
| RS | 7 | 6 months | | Left hippocampal sclerosis |

STIMULI AND PROCEDURE 

Non-verbal auditory expressions were drawn from the validated Geneva Multimodal Emotion
Portrayal (GEMEP) corpus (Bänziger et al., 2012). We selected angry, joyful, and fearful
non-verbal sounds ("ah") produced by two male and two female actors, on the basis of the
recognition rate established in a previous pilot study. For the neutral stimuli, we chose
the most neutrally rated vocal expressions produced by the same actors (neutrality rating:
M = 26.5, SD = 15.67), and the fundamental frequency was flattened using Praat (Boersma and
Weenink, 2011). Sounds were cut and/or stretched to achieve a duration of 1 s (mean
duration before time stretch = 0.92 s, SD = 0.30 s) with SoundForge², and 0.025 s fade-ins
and fade-outs were included using Audacity³. The dynamic faces were created with FACSGen
(Roesch et al., 2011), which allows for the parametric manipulation of 3D emotional facial
expressions according to the Facial Action Coding System (Ekman and Friesen, 1978). They
were selected on the basis of results of a previous study in which participants assessed
the gender and believability of each avatar (Roesch et al., 2011). The lips were animated
to match the intensity contour of each different sound for both unimodal visual and
audiovisual items. The action units (AUs) for each emotion began at 0.25 s and ended at
0.75 s after onset, with their apex at 0.5 s (100% intensity). VirtualDub⁴ was used to
generate the image sequences and to combine the voiced sounds with them at a rate of
26 frames per second (the final image was a dark screen).
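
The AU timing described above can be made concrete with a small sketch. The onset, apex,
and offset times and the frame rate are taken from the text; the linear rise and fall
between them is purely an assumption of this example (the interpolation actually used by
FACSGen is not stated here), and `au_intensity` is a hypothetical helper name.

```python
FPS = 26  # frame rate reported in the text


def au_intensity(t, onset=0.25, apex=0.5, offset=0.75):
    """Action-unit intensity (0.0 to 1.0) at time t in seconds.

    Assumes a linear ramp up to the apex and a linear ramp back down;
    the true interpolation shape is not specified in the source.
    """
    if t <= onset or t >= offset:
        return 0.0
    if t <= apex:
        return (t - onset) / (apex - onset)
    return (offset - t) / (offset - apex)


# One intensity value per frame of a 1 s stimulus.
frames = [au_intensity(i / FPS) for i in range(FPS)]
for index, value in enumerate(frames):
    print(f"frame {index:2d}  t = {index / FPS:.3f} s  intensity = {value:.2f}")
```

Under these assumptions, frame 13 falls exactly on the 0.5 s apex, and the expression is
absent during the first and last quarter of each stimulus.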

² http://www.sonycreativesoftware.com/soundforge
³ http://audacity.sourceforge.net/
⁴ http://www.virtualdub.org/

After signing the consent form, participants completed the behavioral inhibition system
(BIS)/behavioral approach system (BAS) scales and the state trait anxiety inventory (STAI)
on a web interface. They then rated the intensity of 216 items in unimodal [auditory (A),
or visual (V)] and audiovisual (AV; congruent: same information in both modalities;
incongruent: one modality emotional, the other neutral) conditions. The unimodal and
congruent audiovisual stimuli could either express the emotions of anger, fear, or joy, or
be neutral (control condition). Each condition (modality, emotion, or congruency) was
repeated 12 times. Items were presented using E-Prime (standard v2.08.90⁵) in a
pseudorandomized order to avoid repetition of the same stimulus (i.e., synthetic face or
actor's voice) or condition. The participants gave their answers by clicking on a
continuous line between "Not intense" and "Very intense" for six different emotions
(disgust, joy, anger, surprise, fear, sadness), plus neutral. In each trial, they could
provide ratings on one or more scales. At the end of the experiment, they completed a
debriefing questionnaire.
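
The reported total of 216 items can be checked against one plausible reading of the design.
The sketch below assumes that the unimodal and congruent audiovisual conditions each cover
three emotions plus neutral, and that each emotion appears in two incongruent pairings
(emotional voice with neutral face, and emotional face with neutral voice). This breakdown
is an assumption of the example; the text itself only gives the total and the 12
repetitions.

```python
from itertools import product

EMOTIONS = ["anger", "fear", "joy"]
REPETITIONS = 12  # repetitions per condition, as reported in the text

# Unimodal (A, V) and congruent audiovisual (AV): three emotions plus neutral.
congruent = list(product(["A", "V", "AV"], EMOTIONS + ["neutral"]))

# Incongruent AV: one modality emotional, the other neutral.
# Two pairings per emotion is an assumption of this sketch.
incongruent = list(
    product(["voice emotional / face neutral", "face emotional / voice neutral"], EMOTIONS)
)

n_conditions = len(congruent) + len(incongruent)
n_trials = n_conditions * REPETITIONS
print(n_conditions, n_trials)  # 18 216
```

Under this reading, 12 congruent or unimodal conditions and 6 incongruent conditions, each
repeated 12 times, reproduce the 216 items.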

STATISTICAL ANALYSIS 

Since multiple intensity scales were used to collect the answers, our data mostly contained
zero ratings. To assess the interactions, we therefore ran a zero-inflated mixed model on
congruent trials only, using the glmmADMB package for R⁶. This approach models the excess
zeroes as a binomial (zero versus non-zero) response, and the remaining values with a
generalized linear model (GLM) following a negative binomial distribution. Main effects
were tested for group (control vs. patient), modality (audio, visual, audiovisual), and
emotion (anger, fear, or joy, plus neutral). Contrasts were performed to test specific
hypotheses.
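
To make the structure of such a model concrete, the following sketch writes out the
probability mass function of a zero-inflated negative binomial distribution. It is not the
glmmADMB implementation and contains no covariates or random effects; the function names
and the parameter values (`mu`, `size`, `pi_zero`) are hypothetical and chosen only for
illustration.

```python
import math


def nb_pmf(y, mu, size):
    """Negative binomial probability of count y, with mean mu and dispersion size."""
    log_p = (
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, size, pi_zero):
    """Zero-inflated negative binomial probability of count y.

    With probability pi_zero the response is a structural ("excess") zero;
    otherwise it is drawn from the negative binomial component.
    """
    count_part = (1.0 - pi_zero) * nb_pmf(y, mu, size)
    return pi_zero + count_part if y == 0 else count_part


# Hypothetical parameters: most ratings are zero, the rest average 20.
mu, size, pi_zero = 20.0, 2.0, 0.6
print("P(rating = 0):", zinb_pmf(0, mu, size, pi_zero))
print("P(rating = 20):", zinb_pmf(20, mu, size, pi_zero))
```

The mixture makes explicit why a plain negative binomial model would be a poor fit here:
the zero class receives probability mass from both components, whereas positive ratings
come only from the count component.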

The first hypothesis we tested was a group effect for a specific emotion on the target
scale (e.g., fearful item ratings on the fear scale) for each modality (A, V, AV). Sex,
age, and normalized lesion size were added as control variables. Participant and stimulus
ID were added as random effects. A different model was run for each of the three emotions,
plus neutral. Second, four different models, one for each emotion, plus neutral, were
tested in order to compare the impact of the three different modalities in each group. For
instance, for angry item ratings on the anger scale, the modalities were tested in pairs
(AV-A, AV-V, A-V) for the patient group, and individually for the control group. For this
second set of models, we added the same control and random variables as for the first
model. The third model was run to investigate the lateralization effect of the lesion for a
specific modality and a specific emotion, controlling for handedness, age, and sex, and
with random effect variables for participant ID and stimulus ID. Owing to the limited size
of our patient sample, this comparison was of a purely descriptive and exploratory nature.
In order to test whether the effects we found in the different modalities were perceptual
or emotional, we ran a complementary analysis to compare emotional versus neutral items in
each modality and each group, adding age, sex, and normalized lesion size as control
variables, and participant ID and stimulus ID as random effects. Intergroup effects were
also tested for emotional versus neutral items in each modality (A, AV, V), with the same
control variables. Finally, we tested the impact of lesion size by including the number of
voxels in a separate linear model for each emotion and each modality. In this final set of
models, random effect variables (participant ID and stimulus ID) were added.

⁵ http://www.pstnet.com/eprime.cfm
⁶ http://www.r-project.org/

RESULTS 

CATEGORICAL RESPONSES 

Participants could rate the intensity of each item on seven different scales (anger,
disgust, fear, surprise, joy, sadness, and neutral). For each item, we identified the scale
with the highest rating, and calculated a proportional corrected score for each participant
(Heberlein et al., 2004; Dellacherie et al., 2011), by looking at how many other members of
the participant's group (patient or control) had given the same response. This score could
range from 0, meaning that nobody else in the group had chosen the same scale, to 1,
meaning that everyone in the group had chosen the same scale. This type of correction is
used to weight labeling errors, bearing in mind that some errors are closer to the correct
response than others. For instance, it is easier to confuse visual fear and surprise (see,
for example, Etcoff and Magee, 1992) than it is to confuse fear and anger, as the first two
expressions share a number of AUs. For vocal expressions, confusion is also possible, but
between different pairs of emotions (see, for example, Banse and Scherer, 1996; Belin
et al., 2008; Bänziger et al., 2009).
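
The scoring rule just described can be sketched as follows. This is one reading of the
verbal description rather than the authors' own code: the helper names, the tie-breaking
rule in `top_scale`, and the ratings are all hypothetical.

```python
def top_scale(ratings):
    """Return the scale with the highest rating for one item.

    Ties are resolved by dictionary order, which is an assumption of this sketch.
    """
    return max(ratings, key=ratings.get)


def proportional_scores(choices):
    """Map each participant to the proportion of *other* group members
    who chose the same scale for the same item (0.0 to 1.0)."""
    scores = {}
    for person, label in choices.items():
        others = [c for other, c in choices.items() if other != person]
        if not others:
            scores[person] = 0.0
        else:
            scores[person] = sum(c == label for c in others) / len(others)
    return scores


# Hypothetical ratings from a four-member group for one fearful face.
group_ratings = {
    "P1": {"fear": 70, "surprise": 20, "neutral": 0},
    "P2": {"fear": 10, "surprise": 60, "neutral": 0},
    "P3": {"fear": 0, "surprise": 55, "neutral": 5},
    "P4": {"fear": 30, "surprise": 40, "neutral": 0},
}

choices = {person: top_scale(r) for person, r in group_ratings.items()}
scores = proportional_scores(choices)
print(choices)
print(scores)
```

In this toy group, the one participant who labels the face as fearful receives a score of
0, whereas each participant who chose surprise receives 2/3, which shows how the measure
reflects agreement within the group rather than correctness with respect to the intended
emotion.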

Using these corrected scores, we looked for possible differences between the two groups. As
our data violated the assumptions of homoscedasticity and normal distribution, we ran
non-parametric tests for multiple groups. In order to pinpoint differences between the
groups within a specific emotion in a specific modality, we used the Kruskal-Wallis test,
calculating z scores and p values corrected for multiple comparisons of mean ranks (z').
These multiple comparisons are summarized in Figure 2. The control group was more accurate
than the patient group in recognizing joy, whether it was expressed vocally (z' = 3.02,
p < 0.005), visually (z' = 5.17, p < 0.005), or bimodally (z' = i.l9, p < 0.005). Greater
accuracy within the control group was also observed for visual anger (z' = 2.99,
p < 0.005), vocal fear (z' = 2.78, p < 0.01) and, marginally, visual (z' = 1.69, p = 0.08)
and bimodal fear (z' = 1.89, p = 0.058). Finally, a reverse group effect was observed for
the neutral vocal (z' = 3.64, p < 0.001) and audiovisual (z' = 3.64, p < 0.001) stimuli.
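
For readers unfamiliar with the test, the sketch below computes the Kruskal-Wallis H
statistic (with the usual tie correction) and the mean rank of each group. It does not
reproduce the corrected post hoc z' values reported above, the function names are
hypothetical, and the scores are invented for illustration.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average 1-based rank of the tie block
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H statistic with tie correction, list of mean ranks per group)."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_term, mean_ranks, start = 0.0, [], 0
    for group in groups:
        group_ranks = ranks[start:start + len(group)]
        start += len(group)
        rank_term += sum(group_ranks) ** 2 / len(group)
        mean_ranks.append(sum(group_ranks) / len(group))
    h = 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are tied; H is undefined")
    return h / correction, mean_ranks


# Invented corrected scores for one emotion in one modality.
controls = [0.9, 0.8, 1.0, 0.7, 0.9]
patients = [0.2, 0.5, 0.4]
h_stat, group_mean_ranks = kruskal_wallis([controls, patients])
print("H =", h_stat, "mean ranks =", group_mean_ranks)
```

With two groups, H is referred to a chi-square distribution with one degree of freedom; the
difference between the mean ranks is the quantity on which post hoc comparisons of the kind
reported here are based.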

Finally, we tested the impact of lesion size on the corrected hit rate for emotion
recognition. We ran supplementary analyses using a GLM to test this effect with the
modality (A, AV, V) and emotion (anger, joy, fear, neutral) factors, and added the
normalized lesion size as a covariate. The control variables were age, sex, and
lateralization. We observed a significant linear relationship between normalized lesion
size and corrected hit score for visual anger (z = -2.91, p < 0.005), visual joy
(z = -2.37, p < 0.05) and visual neutral stimuli (z = -3.52, p < 0.001). All the
regression slopes were negative, meaning that the more extensive the lesion, the lower the
corrected score. We observed no such effect for fear in the visual modality, as patients
did not recognize this emotion (their corrected score was equal to 0), confusing it with
surprise.

INTENSITY PERCEPTION 

Using a GLM, we first compared the two groups on each specific emotion in each specific
modality, controlling for sex, age, and normalized lesion size, and adding participant and
stimulus ID as random effects. No significant results were observed, even for the fear
items. However, when we ran pairwise comparisons of the modalities for a specific emotion
on its target scale and for a specific group, we did observe significant effects,
especially for the three emotions (see Figure 3). Patients provided higher intensity
ratings of audiovisual versus unimodal visual information for angry (z = -4.14,
p < 0.001), joyful (z = -6.14, p < 0.001), fearful (z = -8.45, p < 0.001), and neutral
(z = -5.61, p < 0.001) items. They also provided higher intensity ratings of auditory
versus visual information for the same emotions (anger: z = -4.14, p < 0.001; joy:
z = -6.14, p < 0.001; fear: z = -8.45, p < 0.001; neutral: z = -5.61, p < 0.001). The
differences between audiovisual and unimodal auditory information were not significant for
any of the emotions (p > 0.15). In the control group, a slightly different pattern emerged
for anger and joy. Anger was given a higher intensity rating in the audiovisual condition
than in either the auditory (z = -3.27, p < 0.001) or visual (z = -10.93, p < 0.001)
condition, and a higher rating in the auditory condition than in the visual one
(z = -6.94, p < 0.001). For joy, audiovisual information was perceived as more intense
than visual information (z = -12.69, p < 0.001), but auditory information was given a
higher intensity rating than both audiovisual information (z = 3.07, p < 0.005) and visual
information (z = -15.58, p < 0.001). Finally, fear stimuli were rated as more intense in
the audiovisual modality than in the visual one (z = -11.74, p < 0.001), and also more
intense in the auditory modality than in the visual one (z = -12.33, p < 0.001). No
significant differences were observed between the modalities for neutral stimuli
(p > 0.4).

In order to ascertain whether the results were perceptual or
emotional, we tested another model contrasting emotional versus
neutral stimuli for each group and each modality (A, V, AV).
Controls rated emotional auditory items as more intense than
neutral auditory items (z = 5.61, p < 0.001), and this was also
the case for audiovisual information (z = 5.15, p < 0.001).
By contrast, the patients provided higher intensity ratings for
neutral items than they did for emotional items in the
auditory (z = -2.64, p < 0.01) and audiovisual (z = -2.42,
p < 0.05) modalities. In the visual modality, patients (z = -3.56,
p < 0.001) and controls (z = -8.40, p < 0.001) alike gave
higher intensity ratings for neutral items than for emotional
ones. When we compared the two groups on emotional and
neutral items in each modality, we found that the patients rated
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FIGURE 2 I Mean proportional corrected scores for patients and controls, taking modality and emotion into account (bars represent the standard 
error of the mean, *p < 0.05, **p < 0.005, ***p < 0.001). 



FIGURE 3 | Boxplot of GLM results for intensity ratings of each emotion
on the corresponding target scale (panels: anger stimuli on anger scale,
joy stimuli on joy scale, fear stimuli on fear scale, neutral stimuli on
neutral scale). Each box corresponds to a specific modality (A: auditory,
AV: audiovisual, V: visual) and a specific group (patients vs. controls).
The difference between A and AV for the controls is almost invisible, as
zero-value data were included in the plot. *Indicates a significant
difference between modalities (pairwise).



the intensity of the neutral items more highly than the controls 
did in the auditory modality (z = 2.24, p < 0.025). For the 
audiovisual modality, the effect was only marginal (z = 1.86, 
p = 0.062). 
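
As a reading aid for the z statistics quoted in this section, the following minimal sketch converts a z value into a two-sided p-value. It assumes the reported z values are Wald-type statistics referred to a standard normal distribution, which the text does not state explicitly; the function name `two_sided_p_from_z` is hypothetical and is not part of the original analysis.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Two-sided p-value for a z statistic under a standard normal null.

    Assumption: the statistic is compared against N(0, 1), as is usual
    for Wald z-tests on model coefficients.
    """
    # For Z ~ N(0, 1): P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2.0))


# z values quoted in the text for the patient vs. control comparison
# on neutral items (auditory and audiovisual modalities).
for label, z in [("auditory", 2.24), ("audiovisual", 1.86)]:
    p = two_sided_p_from_z(z)
    print(f"{label}: z = {z:.2f}, two-sided p = {p:.3f}")
```

Under this assumption, z = 2.24 gives a p-value of about 0.025 and z = 1.86 a p-value of about 0.063, close to the values reported above; small discrepancies are expected because the published z values are rounded to two decimals.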

INTENSITY PERCEPTION AND LESION EFFECT 

We then assessed the impact of lesion lateralization for each
specific emotion in each specific modality. In this GLM analysis,
we compared the patients' ratings on the target scale according
to the side of their lesion, controlling for handedness, age, and
sex, and adding participant and stimulus ID as random effects
(see Figure 4). The patient with a right lesion was found to
provide higher intensity ratings than the patients with left lesions,
but only for angry faces (z = -4.36, p < 0.001) and auditory
joy (z = -3.23, p < 0.005). All other significant effects
concerned the opposite relationship, namely, the patients with left
lesions rated the items as more intense than the patient with
a right lesion did. This was the case for visual joy (z = 3.19,
p < 0.005), auditory fear (z = 8.29, p < 0.001), audiovisual fear
(z = 8.23, p < 0.001), and audiovisual neutral items (z = 3.67,
p < 0.001).

When we added the normalized lesion size as a covariate and 
compared the interactions of perceived intensity and modality for 
a specific emotion on the target scale, we observed a massive effect 
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FIGURE 4 I Boxplot of GLM results for intensity ratings of each emotion on the corresponding target scale. Each box corresponds to a specific modality 
(A: auditory, AV: audiovisual, V: visual) and a specific patient subgroup (left lesion vs. right lesion). *Indicates a significant difference between left and right
lesion conditions.



in the visual modality across all emotions: the larger the lesion,
the less intensely the patients perceived visual anger (z = -2.90,
p < 0.005), visual joy (z = -2.79, p < 0.005), and visual fear
(z = -2.96, p < 0.005). Neutral visual stimuli, however, failed to
reach significance (p > 0.15). This relationship also held for
audiovisual joy (z = -2.15, p < 0.05), but no significant effects
were observed either for other audiovisual expressions (angry,
fearful, or neutral), or for auditory stimuli (p > 0.15).
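
The direction of the lesion-size effect described above (larger lesion, lower visual intensity rating) corresponds to a negative slope when ratings are regressed on lesion size. The sketch below illustrates this with an ordinary least-squares fit, which is a deliberate simplification: the analysis reported here used a GLM with additional covariates and random effects for participant and stimulus. The function name `ols_fit` and all numbers are hypothetical and are not data from the study.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical, made-up values for illustration only (NOT data from the study):
# normalized lesion size per patient and a mean visual intensity rating.
lesion_size = [0.1, 0.2, 0.3, 0.4, 0.5]
visual_rating = [60.0, 55.0, 50.0, 45.0, 40.0]

slope, intercept = ols_fit(lesion_size, visual_rating)
print(f"slope = {slope:.1f}, intercept = {intercept:.1f}")
```

A negative slope in such a fit is the simplest analogue of the negative z values reported for visual anger, joy, and fear; it does not reproduce the mixed-effects structure of the original model.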

DISCUSSION 
CATEGORICAL RESPONSES 

Our main goal was to investigate the relationship between emotion
and modality, comparing patients who had undergone unilateral
anterior temporal lobectomy and amygdalectomy with a matched
control group. Overall, proportional corrected scores revealed
that patients detected joy less accurately across all modalities,
in contrast to previous studies postulating that impairments are
restricted to negatively valenced stimuli (e.g., Brierley et al., 2004).
In addition, the patients displayed deficits for auditory fear and
visual anger. The massive effect we observed for decoding joy has
several possible explanations. First, this effect could be associated
with the amount of information needed for accurate decoding.
For instance, Graham et al. (2007) reported that patients were
impaired in categorizing emotional faces when these were only
presented for a limited duration. In the auditory domain, timing
is also a crucial feature for prosody decoding. In healthy
individuals, researchers have shown that there is a positive correlation
between the duration of the sound and the correct recognition of
the vocal stimulus (Pollack et al., 1960; Cornew et al., 2010; Pell
and Kotz, 2011). Furthermore, happy prosody needs a duration of
at least 1 s to be decoded accurately (Pell and Kotz, 2011), and our
stimuli included 0.25 s fade-ins and fade-outs, thus reducing the
amount of available information and its actual duration. The
second explanation also concerns a lack of information. In the visual
items, the lips were animated to match the intensity contour of
each vocal stimulus, even in the unimodal visual condition. As
a result, this manipulation may have had an impact on emotion
recognition because the information needed to detect a smile was
masked by the movement of the lips accompanying the
vocalization. More specifically, the visual cues in the mouth region that are
needed to detect joy (AU 12 - lip corner puller) and anger (AUs
10 - upper lip raise, 17 - chin raise, 23 - lip funnel, 24 - lip press)
were less visible, and thus less salient. Although we expected fear
perception accuracy to be poorer among patients than among
controls across all the modalities, we found that it was only diminished
for auditory stimuli, indicating that unilateral amygdala damage
is not sufficient to impair fear recognition in the visual domain.
Numerical differences in the confusion matrix (Table 3) suggest
that the lack of an effect for visual information stemmed from
the fact that fearful faces and faces expressing surprise were
confused by both patients (62%) and controls (71%). This confusion
between fear and surprise at the visual level is easily explained by
the proximity of the AUs used to produce these emotional
expressions. In fact, they differ by only two AUs: one in the brow
region (AU 4 - brow lowerer), the other around the mouth (AU
20 - lip stretcher).

Interestingly, the patients were more accurate than controls 
in their detection of neutral expressions in both the auditory and 
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Table 3 | Confusion matrix. 



Percentage of responses for each target emotion in each modality on the six rating scales. Bold values indicate the percentage of correct responses for an emotion 
on the target scale. The "ambivalent" response category corresponds to high intensity ratings on more than one scale for the same emotion. 



audiovisual modalities. In this experiment, controls may have been
biased toward emotional stimuli, in that 75% of items contained
emotional information. They were therefore more driven to search
for emotional cues in the faces. Assuming that emotion detection
plays a functional role, we can surmise that it is less
detrimental to misidentify an object as emotional than to miss information
that could indicate a threat. One can also argue that the patients'
emotion detection networks were less activated by emotional stimuli
(expressed behaviorally as emotional blunting), meaning that
a neutral item was more likely to be perceived as non-emotional.

INTENSITY PERCEPTION 

First, controls and patients alike provided lower intensity ratings
for visual emotional items than for auditory or audiovisual ones.
More specifically, the control group rated visual angry, joyful,
and fearful items as significantly less intense, while the patient
group gave significantly lower intensity ratings for all the visual
items (both emotional and neutral), when lesion size was taken
into account. This less intense perception of visual stimuli could
be explained by the differing nature of the auditory (real human
voices) and visual (synthetic faces created with FACSGen) items.
Nevertheless, the control group exhibited specific patterns of
intensity perception for auditory and audiovisual items,
depending on the emotion. In the case of anger, audiovisual items were
perceived as being more intense than unimodal auditory ones.
This could be interpreted as an increase in the perceived
potential threat, driven by the redundant information in the bimodal
condition, as we are hard-wired to attribute particular importance
to threat-related signals in order to avoid danger more effectively
(Marsh et al., 2005). For joy, we observed the opposite pattern, in
that auditory joyful items were rated as more intense than
audiovisual items. Finally, there was no difference between the intensity
ratings provided for auditory and audiovisual fear items, either in
the control group or in the patient group. It seems, therefore, that
anterior temporal lobe lesions disrupt the processing not just of
fear-related stimuli, but also of other emotions in the visual
modality. An additional analysis comparing emotional and neutral items
showed that patients produced higher intensity ratings for
neutral items than for emotional ones, regardless of modality. This
effect across modalities lends further weight to the assumption
of emotional blunting among these patients. When we compared
the groups on emotional and neutral items for each modality, we
found that differences only showed up in the auditory and
audiovisual modalities, with higher ratings for neutral items provided
by patients compared with controls. It is not entirely clear whether
the lesions alone were responsible for this effect or whether a more
general dysfunction of the epileptic brain was to blame, although
the correlations between lesion size and emotional judgments
suggest that the lesions themselves had an impact, beyond a general
epileptic effect.

The present data indicate that the anterior temporal lobe plays 
a variety of roles, depending on the modality. First, patients exhib- 
ited a greater deficit in intensity perception for the visual modality 
as a linear function of lesion size for all emotional expressions. This 
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result highlights an important role of this region at the end of the 
ventral visual pathway, regardless of the nature of the emotional
information. Second, modality had an impact on the ratings
provided by the controls for specific emotions. Anger, for instance,
was perceived as more intense in the audiovisual modality than
in the auditory or visual ones, while joy was perceived as
more intense in the auditory modality than in the audiovisual
or visual ones. This emotional modality preference has already been
noted by Bänziger et al. (2009). Until now, however, it has
never been observed in patients. It could be linked to the deficit
in the visual pathway mentioned earlier, as no differences were
observed between the unimodal auditory condition and the
audiovisual one, suggesting that the disruption of the visual processing
channel meant that the processing focus had to be switched to the
auditory modality. We can therefore hypothesize that our patients'
audiovisual processing was impaired as a consequence of a lack of
input from the visual pathway toward the anterior temporal lobe.
Crossmodal integration in the PRC, an associative area in the
anterior temporal lobe that has been highlighted in both animal (e.g.,
Goulet and Murray, 2001) and human (e.g., Taylor et al., 2006)
studies, may therefore play a major role in audiovisual integration.

INTENSITY PERCEPTION AND LESION EFFECT 

We expected the patient with right amygdala damage to exhibit
a greater deficit than those with left damage, given that emotion
decoding appears to be right-lateralized (e.g., Adolphs,
2002; Schirmer and Kotz, 2006). Different deficit patterns were
observed, however, depending on emotion and modality. The
patient with a right temporal lesion displayed a deficit in auditory
and audiovisual fear perception, along with a deficit in visual joy
perception, while the left-lesion patients rated joyful prosody and
angry visual expressions as less intense. These last two emotions
can be seen as approach emotions, and behavioral activation system
(BAS) scores have been shown to correlate with activity in the left
hemisphere (Harmon-Jones and Allen, 1997; Coan and Allen, 2003).

In addition to the lateralization effect, results highlighted a
major impact of lesion size, mainly for the recognition and
intensity ratings of visual emotional items. This massive visual
impairment could be explained by the impact of the resection on
part of the visual "what" (ventral) pathway: the absence or
disruption of this component of the visual pathway system may have
had a greater effect because of the reduced cues for determining
expressions in the visual stimuli (i.e., masking by lip movements
matched with vocalizations). Based on prior research with animals
(Ungerleider and Mishkin, 1982), Catani and Thiebaut de Schotten
(2008) showed, using diffusion tensor imaging, that the inferior
longitudinal fasciculus, a ventral associative bundle, connects the
occipital and temporal lobes (more specifically, the visual areas) to
the amygdala. Given that lesion size particularly seemed to affect
the visual modality in our study, we can surmise that a
compensatory mechanism was at work, whereby the lack of discriminating
information in a specific modality triggered a shift toward another
modality (see, for example, Bavelier and Hirshorn, 2010).

LIMITATIONS 

The first caveat regarding our experiment concerns the small
number of patients, and the fact that only one patient had
undergone a right anterior temporal resection, while another had
a larger resection. However, the discrepancy between the number
of patients and the number of controls did not impede our
statistical analysis, owing to our choice of model and the fact that
we tested every model excluding Patients JP or CM to see if we
observed any change, which was not the case. The more
important point to take into consideration is the difference between the
visual and auditory information. The sounds were taken from the
GEMEP database, which features real human voices. By contrast,
the visual stimuli were non-natural faces (i.e., avatars), and this
difference could account for the increased difficulty in labeling the
expressions, even though they matched the Ekman coding system
(see FACSGen; Roesch et al., 2011).

CONCLUSION 

The results revealed a visual deficit in the perceived intensity of
emotional stimuli. This deficit was explained by lesion size, in
that the larger the lesion, the lower the intensity ratings for the
visual items. This could be caused by disruption to the visual
pathway connecting the occipital lobe and the amygdala, but
further investigation is needed to test this hypothesis.
Furthermore, emotional blunting effects may also have played a part,
given that the neutral expressions were given higher intensity
ratings by patients than by controls. It would be useful to
determine whether the absence of audiovisual enhancement in the
patients' perception can be accounted for solely by the amygdala
or whether the absence of the PRC, an area that has already been
identified as an integrating area in both animals (Goulet and
Murray, 2001) and humans (Taylor et al., 2006), is also an important
factor.
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